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Objectives 


The principal objective of this unit is to discuss the problem of solving 
a system of simultaneous linear equations and, in particular, to discuss the 
accuracy of the results obtained and the efficiency of the method used. 


After working through this unit you should be able to: 


(i) explain what is meant by the following terms: 
error vector, 
ill-conditioned system of equations, 
nearly singular matrix, 
norm of a vector; 

(ii) determine the efficiency of the Gauss elimination method or a matrix 
inversion method, by determining the number of specific arithmetic 
operations required ; 

(iii) rearrange a given matrix equation 


Ax=b 


into a suitable form which guarantees the convergence of an iterative 
method for the solution ; 


(iv) determine whether a given (simple) system of simultaneous equations 
is ill-conditioned. 


Note 


Before working through this correspondence text, make sure you have 
read the general introduction to the mathematics course in the Study 
Guide, as this explains the philosophy underlying the whole course. 
You should also be familiar with the section which explains how a text 
is constructed and the meanings attached to the stars and other symbols 
in the margin, as this will help you to find your way through the text. 
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Glossary 


Terms which are defined in this text are printed in CAPITALS. 


DIRECT METHOD 


ERROR VECTOR 


ILL-CONDITIONED 
SYSTEM OF 
EQUATIONS 


INDIRECT METHOD. 


MATRIX OF 
COEFFICIENTS 


NEARLY SINGULAR 
MATRIX 


NORM OF A VECTOR 


SPARSE MATRIX 


vi 


A DIRECT METHOD of solving a system of linear 
simultaneous equations is a method which produces 
an explicit result for the solution. 


The ERROR VECTOR for a system of linear equations 
is 


(estimated solution vector) — (solution vector). 


An ILL-CONDITIONED SYSTEM OF EQUATIONS is a 
system for which small changes in the coefficients 
produce large changes in the elements of the solu- 
tion set. 


An INDIRECT METHOD in the context of this unit is 
an iterative method. 


The MATRIX OF COEFFICIENTS of the system of 
equations 


y1Xy + Ay2Xp +++ 


Gm1X1 + AmaX2 + +++ + AnnX_ = Dy 


yy Aya, Gay 
Ga, 22 +" ay 
Gms Ama inn 


A NEARLY SINGULAR MATRIX is a non-singular 
matrix for which small changes in the elements 
would produce a singular matrix. 


A NORM OF A VECTOR is a numerical measure of 
the “size” of a vector. 


A SPARSE MATRIX is a matrix in which the elements 
are predominantly zero. 
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Notation 
The symbols are presented in the order in which they appear in the text. 


A 
x 
Cc 
AY 
Xi 
ay 
ew 
x” 


x 


rau) 
(r) 


i 
llall 


Anon-singular matrix of coefficients. 

A vector. 

The set of all complex numbers. 

The inverse of A. 

An element of the vector x. 

An element of the matrix of coefficients A. 

The rth approximation to the solution vector in an iterative method. 
An element of x”. 

The exact solution vector. 

The error vector: ( x” — X). 
An element of e. 

A norm of the vector a 


23 
24 


where “norm” is in general defined by properties (i) to (iv) on page 24. In this unit, the 


particular norm which we assume, unless otherwise stated, is defined by 


\lal|=la,|+|a,|+---+]a,|, where a is an nx1 vector with elements a;, i=1,..., n. 
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28.0 INTRODUCTION 


This unit considers in detail just one aspect of the work discussed in 
Unit 26, Linear Algebra III. We are going to look at some practical 
methods of solving a set of simultaneous linear equations, and some of 
the difficulties involved. We regard this problem as a vehicle for discussing 
some of the methods and practices of numerical calculation: it is another 
step in the numerical analysis aspect of this course, which we began in 
Unit 2, Errors and Accuracy. 


We begin by stating the problem precisely. 
The matrix equation 
Ax = b, 


where x and b are n-element column vectors and A is an n x n matrix, 
represents a system of n simultaneous linear equations in n variables 
(unknowns). If A is a non-singular matrix, that is, its rank is n, then the 
matrix equation has a unique solution. This is the only type of equation 
we shall deal with in this unit, since one of our aims is to examine how 
the solution is derived, particularly for large matrices. Thus we shall 
assume that a unique solution exists ; our problem is to determine it. 


If you have solved simultaneous equations already, you have probably 
solved systems of 2 or 3 or perhaps 4 equations, In your working you 
endeavoured to avoid blunders, and, as a final check, you substituted 
your answers back into the original equations. You were not interested 
in comparing different methods of solution to see if some were quicker 
than others, nor were you concerned with the accuracy of the solution 
other thin in the sense that it satisfied the equation. With systems of 
several thousand equations, time can be an important factor, and we 
shall see that there is more to accuracy than correct arithmetical calcula- 
tions. When you solved your equations, you almost certainly used a 
direct method, i.e. one that leads step by step to the answer, which appears 
at the last step(s); for instance, a method like the Gauss elimination 
method which we discussed in Unit 26. We shall look at that method 
again, along with two other direct methods, in section 28.1. 


When you solved a small system of equations, you probably never even 
contemplated an iterative method (i.e. a method in which at each step 
we obtain another approximation to the solution), since the direct 
method was so quick and reliable. In larger systems with certain character- 
istics, iterative methods can be useful. We look at these methods in 
section 28.2 and we see how they link together the ideas of the iterative 
methods of solving equations (described in Unit 2, Errors and Accuracy) 
and some of the vector space concepts discussed in the earlier units on 
linear algebra. 


Finally, we come to the problem of accuracy. Frequently, the data we 
use to set up the equations are inaccurate, and so the eventual result 
may be in error, not only because of round-off errors which may have 
built up during the computation, but also from inaccuracies propagated 
from the very start. In certain special cases this inaccuracy makes the 
results almost worthless. It is this aspect of the solution of simultaneous 
equations, called ill-conditioning, which we shall look at in section 28.3. 


28.1 DIRECT METHODS 
28.1.0 Introduction 


Mathematicians like to be able to record an explicit solution to an 
equation. The oft-quoted solution to the quadratic equation 


ax? + bx +c = 0, 


namely 


yw cbt VP tae 


2a 


is an example of this. The desired variable is to the left of an “equals” 
sign, and the letters to the right of the “equals” sign can be replaced 
by numbers from the data in any specific example. The solution consists 
of the two numbers which map to zero under the function 


xe—ax?+bx+e (xeC). 
Similarly, we can write down an explicit solution to the matrix equation 
Ax = b, 
where A is non-singular, in the form 
x=A7'b, 
the solution vector being the unique vector which maps to b under the 
mapping 
xr— As (xe R"). 


‘A method which uses such an explicit formula to obtain the answer is 
called a direct method. Not all direct methods, however, use formulas to 
calculate the answer. For instance, the Gauss elimination method 
(described in Unita is a direct method, but we do not use a formula for 
the answer, but a step by step procedure (which could be specified by an 
algorithm) to get from the data to the answer. This is the characteristic 
of a direct method (as opposed to an iterative method, which we discuss 
later): it is a method of obtaining the answer to a problem from the 
data by a step by step procedure, the answer (or an approximation to 
the answer) being obtained only at the end of the procedure, there being 
no approximations to the answer en route. 


To calculate the solution vector to a system of equations using the formula 
x = A~b, we must determine A~! and post-multiply by b. We examine 
the efficiency of this particular process in section 28.1.3. We shall also 
compare its efficiency with the efficiency of two other methods: the 
Gauss elimination method, which will be examined in section 28.1.2, 
and one other direct method, which we look at in section 28.1.1. Finally, 
we must emphasize that all three methods of solution are algebraically 
equivalent, since the solution is unique. This means that if we wrote 
them out as explicit formulas, we could derive one formula from the 
other by algebraic manipulation. In this unit we are interested in com- 
paring the computational methods and the lengths of time taken. 


FM 28.1.0 


Definition 1 


28.1.1 An Explicit Method 


By appropriate manipulation (for example, eliminating x, between the 
equations, and then solving for x,), the solution of 


Oy 1X + Gy2%2 = by 
ayX + Gaa%2 = ba 


may be written as 


x = bidaa batts — andy 
fat la ’ S aad . 
11422 — 44243) 411422 — Gy20a4 


provided that 
441422 — 41242, # 0. 


(All coefficients a,,,a,2,... and b,,b>,... used in this text are assumed 
to be real numbers.) In this case the solution we have found is the set of 
components of the unique vector x which maps to b under the mapping 


xr—Ax (xe R?). 


Thus, at one step, provided the condition is satisfied, we have formally 
written down the solution to all pairs of simultaneous equations in two 
variables, which have a unique solution, 


We can go on to find that for a system of three simultaneous linear equa- 
tions: 


4X + Ay2Xq + Ay3X3 = by 
4X1 + Ay2Xq + y3X3 = by 
3X1 + dy2Xz + ygXy = by 


the solution set may be written as 
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Main Text 


bi(@z2433 = Gay32) — bafdy2433 — 413432) + b3(4i2423 ~ 413422) 


x= 


together with similar expressions for x, and x, (with the same denomina- 
tor), provided, of course, that the denominator does not equal zero, (If 
you have plenty of time to spare, you may like to verify the above expres- 
sion by solving the system of equations yourself. But don’t get too bogged 
down by the algebraic manipulations.) 


Again, we can see that, by simple substitution of numbers for the letters, 
we can solve any system of the above form which has a unique solution. 
We could, in theory, solve a system of simultaneous linear equations of 
any order by developing the appropriate formulas. Why then do we not 
stop here and simply solve simultaneous linear equations this way? 
The reason is that the expenditure of time and effort involved is too great. 


Let us consider the number of multiplications and divisions involved in 
solving the three simultaneous linear equations. We concentrate on the 
operations of multiplication and division, since these tend to be more 
time-consuming than addition and subtraction, both for manual and 
computer operations. 


The expressions for x,, x, and x; have the same denominator, so this 
has to be calculated just once. Each numerator has the same form as the 
denominator, and there are 3 numerators. Thus we need to calculate 4 
expressions of the form: 


4 4422433 — 3432) — Ay4(Ay 2433 ~ 41343) + 434 (12423 — 43422). 


Ay 4(4y233 — A332) — Ap 4(4y 233 — 4 3432) + 431442423 — 443422) 


Each bracket contains 2 products. Evaluation of the whole expression 
therefore requires 9 multiplications. 


Total number of multiplications is 4 x 9 = 36 
Number of divisions 3 
Therefore total number of time-consuming operations = 39 


You may have noticed that three bracketed terms in one numerator (that 
of x, in our example) are the same as the bracketed terms in the denomina- 
tor. Using this, we could save 6 multiplications and would therefore 
require only 33 time-consuming operations. 


Exercise I 


What is the total number of operations of multiplication and division 
required to solve a system of two simultaneous linear equations by 
substitution in the formulas given in the text? a 


A general expression can be derived for the number of operations of 
multiplication and division for the solution of n equations in n unknowns, 
by this method; it is given by the sequence 


For large values of n, the nth term of this sequence, u,, can be shown to be 
approximately equal to 


1.72 xn x nl! 


This enables us to produce the table below. The final column gives a 
rough idea of the time it would take an automatic computer to solve the 
problem if, for example, each operation were to take one microsecond 
(1 microsecond = 10° s; it is denoted by 1 ys). 


Number of Number of 

simultaneous equations operations Time taken 

2 8 8 us 

3 33 33 us 

4 168 168 pis 

10 ~62 x 10’ 625 
100 =1,6 x 101°° ~10'47 year 
1000 ~69 x 107579 | ~10?557 year 


This shows the tremendous increase in the number of operations and the 
consequent time required as n increases; a more efficient method is 
obviously desirable. 


Optional Section on Determinants 


If you wish, you may omit this section, which is not essential to the 
development of this unit or the course. It is included because it briefly 
discusses determinants, which have in the past been closely associated 
with the solution of simultaneous linear equations. 
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Exercise 1 
(2 minutes) 


The expression 
414422 — 44249; 
is called the determinant of the matrix 
as i a 
7 Nagy aaa)’ 


and is written as 


A 


Oya, “A 42) 


Mo, 22 


or det A, or |A,|, so that the solution of two equations in two variables 
may be written as 


by aya 
by a, 

5 a 
41 Gia 
ae zal 


Similarly, the expression 
y (422433 — Ap3y2) — 4G 2433 — Ay 3432) + A34(4 42423 — 443422) 
is called the determinant of the matrix 
Gy, Ayn Ay 
Ay =| 421 422 423}, 
3, G32 M33 
and is written as 
M1 S12 O13 
42; 422 G23 
a3, G32 33 


or det A, or|A,|,so that the solution of three equations in three unknowns 
may be written as 


by a2 ais 
ba 422 423 


by 32 433) 


x,= ' 
4 
My, Ayr Ay 


421 922 a3 
431 G32 433 
etc. 


The determinant of a square matrix is a certain number associated with 
the matrix. Essentially, its use at this level in mathematics is confined to 
being a shorthand for expressing the solutions of simultaneous linear 
equations. Since actual calculations using the formula for the solution, 
which are equivalent to evaluating the determinants, become much too 
cumbersome for systems of more than 3 or 4 simultaneous linear equa- 
tions, we have not made determinants an integral part of the course. The 
number of equations for which the use of determinants is, so to speak, 
at its peak performance is only three. 
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Solution 1 


8 a 


28.1.2 The Gauss Elimination Method 


In Unit 26, we introduced the Gauss elimination method of solving 
systems of equations. The objective of the method is to produce an 
equivalent set of equations which are easier to solve than the original set. 
We now ask you to consider this method again and to check how efficient 
it is in terms of the number of operations it takes to produce the answer, 
since for large systems of equations the method discussed in section 28.1.1 
is clearly unacceptable. (Even if somebody had started a modern com- 
puter solving 100 simultaneous equations by that method at the beginning 
of this century, it would still not have made a dent in the problem: it 
would not even have performed 107 '°° % of the calculations.) 


Exercise 1 


Solve the following set of three equations by the Gauss elimination 
method with back substitution, in the order shown. Calculate on rough 
paper and then use the spaces provided to fill in the steps in your solution. 
The numbers entered should be expressed as decimals, not fractions. 
Note (in the column at the side) the number of divisions and multiplica- 
tions that have occurred. What is the total number of these operations 
for the whole solution? 


5x, + 2x, —2x,;=8 (1) 
3x, + 6x2 — 4x3 =1 (2) 
2x, + 4x2 — 2x3 =1 (3) 
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Exercise 1 
(3 minutes) 
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Step in the Calculation 


Number of Multiplications 
and/or Divisions 


Eliminate x, from equations (2) and (3). 


Factor for equation (2) is 


Factor for equation (3) is 


5x, + 2x2 + (—2) x= 8 
x2 + x3 = 
X+ x3 = 


Now eliminate x, from equation (5). 


Factor for equation (5) is 


5x, +2x,+(-2) %x,= 8 
Na+ x3 = 
x3 = 
Therefore 
x= 


() 
(4) 


(5) 


(y 
(4) 


(6) 


and back-substituting into equation (4) gives 


x2 = 
and from equation (1) 
x= 
=e 
Total number of divisions and multiplications 
a 


Solution I 
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Solution 1 


Step in the Calculation 


Number of Multiplications 
and/or Divisions 


Eliminate x, from equations (2) and (3). 


Factor for equation (2) is —0.6 

Factor for equation (3) is -04 

Suit Oxg (=2) x=, 8 a) 
48. sesh SOG) eae. (4) 


32) x, Sia gym 2d (5) 
Now eliminate x, from equation (5). 
Factor for equation (5) is —0.66 


Sx, + 2x2 + (-2) xs 8 () 


48 Xa+ -28 x= -38 (4) 


033 (6) 


e 
a 
z. 
Ry 
f] 


Therefore 
x,= O05 

and back-substituting into equation (4) gives 
x,= -05 

and from equation (1) 


x= 2 


1 division 


1 division 


3 multiplications 


3 multiplications 


1 division 


2 multiplications 


1 division 


1 multiplication and 1 division 


2 multiplications and 1 division 


Total number of divisions and multiplications 


17 


Notice that the number of multiplications and divisions is roughly half 
the corresponding number for the method considered in section 28.1.1. 


Next we try to determine the amount of computation required to solve 
a general system of equations by the Gauss elimination method. We 
attempt an analysis similar to the one given in the last exercise. We then 
compare the labour involved in this method with the labour involved 
in the method of section 28.1.1. 


Consider the first two equations of a system of n equations: 


Gy yXy + AyaX2 + +++ + Aanky = Oy 
4p4Xy + Ag2X2 + +++ + GayXy = b2 


We assume that a,, is non-zero. If it were not, then we could alter the 
order of the equations until we found a coefficient of x, which was non- 
zero. In hand calculation, such a re-ordering does not involve an important 
time factor, but in an automatic machine calculation it would, of course, 
have to be taken into account. To eliminate x, from the second equation 
requires the calculation of the quotient @),/a,, and then the formation 
of the n numbers 


Gn Gay Gar boc St 
yz — G42, 23 — 354464 Aay — Ags bz — —by. 
a ay ay ay 


Thus the formation of the new second equation requires (n + 1) opera- 
tions of multiplication or division. In the next exercise we ask you to 
calculate the total number of multiplications and divisions required. 


a: - . 
We assume that a. — a 4, is non-zero for the next round, since we 
shall have to divide by it to produce the next quotient corresponding to 


a 
1 that is, 
ayy 


CEY 
a1 


423 — ——-ay2 
ayy 


If it were zero, we could alter the order of the equations until we found a 
coefficient of x which was non-zero. (What would you infer if all sub- 
sequent coefficients of x, were zero?) 


In general, in the discussion in this text we shall assume that we can 
perform the divisions as we come to them. 


Exercise 2 


Fill in the following table to calculate the number of multiplications and 
divisions involved in solving a system of n equations in n unknowns by 
the Gauss elimination method. The formulas derived in Unit 4, Finite 


ns n(n + 1) 

2 
_ n(n + 1)(2n + 1) 
See 


will be useful. 
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Step in the Calculation 


Number of Multiplications/Divisions 


Eliminate x, from all equations after the first. 
Eliminate x, from all equations after the second. 


Eliminate x, -, from all equations after the 
(n — 1)th, i.e. the nth equation. 


Total for elimination is 


Now carry out the back substitution. 


To determine x, from an equation such as 
GyXn = Br, 
To determine x,—, from an equation such as 


Yn=1¥n—1 + Yon = Br=1 


To determine x, 


Total for back substitution is 


(n = 1)(n + 1I)= 


Therefore total number of operations is 


Compare your answer with the answer to Exercise 1. a 
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We are now in a position to compare the explicit solution and the Gauss Discussion 
elimination method from the viewpoint of efficiency. The results are 
tabulated below. 


Formula method | Gauss elimination method 
No. of operations is No. of operations is 
Number of 1.72n x n} We bon 
equations (for large n) 4 +05 
S 33: 17 
4 168 36 
10 ~6.2 x 107 430 
100 ~1.6 x 101° ~34 x 10° 
1000 ~6.9 x 10757° ~33 x 10° 


This table demonstrates dramatically why the formula method is never 
used for large systems of equations. 


In fact, the formula method is never quicker than the Gauss elimination 
method. For a quick comparison of the methods for large n, we would 
say that the number of operations for the Gauss elimination method is 


3 
approximately = since, when n > 100 say, the rest of the terms affect 


the number of operations by less than three per cent. 


Solution 2 
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Solution 2 


Step in the Calculation 


Number of Multiplications/Divisions 


Eliminate x, from all equations after the first. 
Eliminate x, from all equations after the second, 


Eliminate x, _ , from all equations after the 
(n — 1)th, ie. the nth equation. 


Total for elimination is 


Now carry out the back substitution. 


To determine x, from an equation such as 
Gyn = By 
To determine x, , from an equation such as 


Yn=1¥n—1 + YnXn = Bama 


To determine x, 


Total for back substitution is 


(n—1)(n + 1)=n? -1 


(n-—2n =(n-1)? -1 


i} 


[feo 


=2 


nn + 1)(2n + 1) _ 
6 


i}-@-» 


Therefore total number of operations is 
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28.1.3. A Matrix Inversion Method 28.1.3 
In Unit 26 we described a numerical method for finding the inverse Main Text 


of ann x n matrix; the method is very similar to the Gauss elimination 
method. We pointed out that if we had to solve several systems of equa- 
tions of the form 


Ax =b, 


in which the matrix A was the same in each case, and only the matrix b 
changed, then there might be some value in computing A! and calculat- 
ing the solution from the formula 


x=A'b 
for each case. 


We shall now look at the number of calculations involved and see just 
when this method is of “some value”. 


Just to remind you of the method, we describe the essential steps. Suppose 
we want to invert the matrix 


isl > 
Sei=oile 
eats it 
We write 
een rane!) = Teel 
6-1 -910 1 0 
ee Sa te 


and use elementary row operations to turn this array into 
1 oO Ol x x x 
0 1 0 
0 0 1 


x x x 


\ 
| 
\ 
ke ees 
where the crosses represent the elements of the inverse matrix. We have 
ignored the row-sum check : this would be involved both here and in the 
Gauss elimination method, so for a comparison between the two methods 
we can ignore it, 


We begin by dividing the first row of our array of numbers by 2 in order to 
obtain a 1 in the top left-hand corner. This involves 3 divisions. We 
obtain 


t 4+ <b 0 0 
6-1 910 1 0 
We oe Se cla Ome eet 


Next we subtract 6 times the first row from the second and 4 times the 
first row from the third, This involves 2 x 3 multiplications. We obtain 

Oe emcees 1 20) 

i 

0 -—4 -—6 ' 3 1 0 

OF hse Oe 
Thus to get the first column into the required form we have performed 
9 time-consuming operations. 


In general, to compute the inverse of ann x n matrix, we form the corre- 
sponding n x 2n matrix by “joining” on the n x n unit matrix. 


To produce 
1 
0 


0 
in the first column requires n divisions to produce the | and (n — 1) x n 
operationsto produce the 0's. Thatmeans werequiren + (n — 1) x n =n? 
time-consuming operations to derive the first column. Performing the 
manipulations to produce each subsequent column of an n x n unit 
matrix also requires n? operations. Since there are n columns, the total 
number of operations required is n°. 


Having found the inverse matrix, we then have to calculate A~'b, which 
involves a further n? multiplications. 


Thus the total number of operations to obtain a solution is 
m+n, 
compared with 


: 
n 2 
oa eas 


for the Gauss elimination method. 


Number of | Gauss elimination Inverse method 
equations | No. of operations | No. of operations 


3 17 36 

4 36 80 

10 430 1100 
100 ~3.4 x 10° ~3 x 34x 10° 
1000 ~3.3 x 10° ~3 x 3.3 x 108 


For large n, the n° term is much larger than the rest, so that the inverse 
method is roughly three times as long as the Gauss elimination method. 
Thus, when there are three or more systems of simultaneous equations 
to solve with the same A but different b’s, it is usually quicker to find A~! 
first, and use it for all the different vectors b, rather than calculate each 
solution by the Gauss elimination method. 


Exercise 1 


How many multiplications are required to multiply together two n x n 
matrices? a 
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Exercise 1 
(3 minutes) 


28.1.4 Summary 


In this section we have measured the efficiency, in terms of time for 
computation, of three direct methods of solving systems of n equations 
(in n unknowns) by investigating the number of operations of multiplica- 
tion and division required for each. For large n, we found the following 


results. 


Approximate 


A method using 
the inverse of a 
matrix. 


number of Usefulness for 
operations large n 
Explicit method 1.72n x n! Ofno use whatsoever. 
of section 28.1.1. 
Gauss elimination ne Widely used when 
method. gi there are no more 


than 3 systems with 
the same left-hand 
side. 


Widely used when 
there are more than 
3 systems with the 
same left-hand side. 


The methods considered in section 28.1 are basic methods. There are 
many refinements to them for particular problems, some of which are 
listed, for example, in Fox, Numerical Linear Algebra, Chapters 3 and 4 


(see Bibliography). 
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Solution 28.1.3.1 Solution 28.1.3.1 


Each element of the product matrix requires n multiplications. 


There are n? terms in the product matrix. Therefore the number of 


multiplications is n°. a 
28.2, ITERATIVE OR INDIRECT METHODS 28.2 
28.2.0 Introduction 28.2.0 


Theoretically, given enough time and ignoring round-off errors which Introduction 
may occur in the computation, it is possible to solve exactly a matrix 
equation 


Ax = b, 


with exact elements a;, and b;,, by one of the direct methods described 
in section 28.1. It may therefore appear to be pointless to consider any 
other method, such as an iterative method in which we make a guess at the 
solution and then refine it, However, iterative methods are important; 
we discuss them for the following reasons. First, it often happens that, 
when there is a large system of equations to be solved, a large number of 
the elements of the matrix of coefficients are zero ; for example, 


xT x) 0) ke) 
0 


x x x 0 0 
Oo eo Olen) 0) 
x 0 0 x 0 x] 
x 00 0x 0 


QO CxO 10 


where the x's represent non-zero numbers. Such matrices are called 

sparse matrices (for obvious reasons). An iterative method, by taking Definition 1 
account of the zeros from the outset, can sometimes give the result much se 
more quickly than a direct method. Of course, if the pattern of zeros 

were convenient, for example, if we started with 


AN 


then the back-substitution in the Gauss elimination method would be 
available straight away. The point is that the iterative method does not 
depend on a convenient pattern. Secondly, the iterative method, if it 
converges, improves the accuracy of the approximate solution at each 
stage. As such it can be used, after a first crude approximation by a 
direct method, to improve the accuracy of the solution. There may be 
other advantages in terms of time and economy in the use of space in a 
digital computer, but these would need closer analysis. In terms of 


automatic computation, iterative methods often have the advantage that 
various stages in the iteration repeat the same relatively simple process, 
which is ideal for economic and efficient programming. The iterative 
method is also of interest from a purely mathematical viewpoint; it 
shows how the ideas of the mapping of numerical error intervals, intro- 
duced in Unit 2, Errors and Accuracy, can be extended to the discussion 
of vectors. 


28.2.1 Some Methods of Iteration 


We recapitulate briefly on the iterative method for solving equations 
defined on the set of real numbers, which we discussed in Unit 2, Errors 
and Accuracy. To solve the equation 


x? —5x+3=0, 
we tried a rearrangement such as 
x =x? + 3), 


which picks out the fixed or invariant elements of the domain of the 
function 


xr 4x3 +3) (xe R), 


ie. those elements of the domain which are equal to their images in the 
codomain, A first guess, xo, at the solution gave a new number 


x, = A(xo + 3), 


which we hoped was nearer to the exact solution, which we shall call X. 
Subsequent steps formed the sequence whose elements x, were given by 


x, = Ux?-1 +3), 


which converged to X, provided that the magnitude of a scale factor, 
determined from the function 


xr—+ ix? +3) (xe R), 
was less than unity. 


In this unit, the equation defined on the set of real numbers is replaced 
by the matrix equation 


Ax -—b=0. 


The solution, instead of being a number, is now a solution vector. The 
mapping, which we shall derive from some rearrangement of the matrix 
equation, such as 


x = Gx + Hb, 


now maps an n-dimensional vector space to an n-dimensional vector 
space. We look for the invariant vector, i.e. the vector which remains 
unchanged under the mapping. We shall search for an algorithm giving 
a sequence of vectors which converges to the solution vector. By analogy 
with our previous experience of iterative methods, we shall also look 
for some number, corresponding to the scale factor, which will give us a 
hint as to whether the sequence under investigation is convergent. A 
considerable amount of the analysis involved is beyond the scope of this 
course, but we can get a hint of what it is like by looking at an elementary 
example. We shall use an illustration of two simultaneous equations in 
two unknowns although, customarily, the method would be used only 
on much larger systems. 
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We shall examine the system of equations 
xX, + 4x, =6 
2x, + x9 =5 
The actual solution can easily be found to be 
Piya X= 1: 


that is, the solution vector is 


( 


We postpone the complete matrix approach for a moment and simply 
rearrange the equations in a couple of ways. Firstly, we solve the first 
equation for x, in terms of x, and the second for x, in terms of x, obtain- 
ing 


15 — 0.25x, 
2.5 — 0.5x; 


x2 


xy 


We now develop the sequences x, x4) where successive elements* are 
defined by 


xt) = 15 — 0,25x" 
xft) = 25 — 05x 
Let the first guess at the solution vector, 


ai 


x) 
be 
(tl 
of” 
We then get 


x = 15-025 x0=1.5 
xi) =25-05 x0=25 
and the sequence of estimated solution vectors up to the fifth term is 
0 2.5) 1.75 2.06 197 
(() ("s) (ass) fre ial : 


Intuitively, it seems that this sequence is converging to the solution 


vector 
Exercise 1 Exercise 1 
(3 minutes) 
Use the rearrangement 
xX, =6—4x, 
Xz =5—2x,, 


c 0 “ ; 
starting with the vector (‘| to obtain a sequence of five estimated solution 


vectors for the system of equations in the text. Does it appear to con- 
verge? a 


* In previous iterative processes we have used the subscript r, e.g. x, to indicate the rth 
estimate of the solution. Now, to avoid confusion with the subscripts already in use and 
with the normal use of a superscript as an index, we use a superscript in brackets, eg. x'”. 
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The last exercise indicates, as we might expect from our previous Discussion 
experience, that some rearrangements “work” and some do not. What is 
we would like to have is some means of testing a rearrangement, corre- 

sponding to the scale factor test discussed in Unit 2. For example, we 

might like to test 


x, =3 + 05x, — 2x2 
xq = 25 — x, + 0.5x2, 


which is more complicated than either of the previous two rearrange- 
ments, 


We shall start our search for a “scale factor” in the context of simul- 
taneous equations by examining the iterative process in more general 
terms. All our rearrangements have the iterative form 


xt) = Gx + Hb 


where G and H are square matrices. For instance, in the rearrangement 
for which we calculated the sequence which appeared to converge, we 
had 


x¢t)) = 2.5 — 0.5x9) 
xg*l) = 1.5 — 0.25x9 
This can be written in matrix form as 
Xft) 0 —0.5) {x 0 0.5) (6' 
eae 2 cae 0 }e) a ee 0 \(9. 
So in this case, 


| 0 -0.5 
-0.25 0 


0 °*) 


and a=| 
0.25 0 


We shall now attempt to explain what we mean by convergence of the 
sequence of estimated solution vectors. When we are solving an equation 
in R, we can work in terms of numerical errors in a single number and try 
to make this as small as possible. We try to find a rearrangement of the 
equation, and hence a function, which will make the error interval in 
which the true solution lies as small as possible. Now that we are working 
in terms of vectors, it seems natural to examine the behaviour of an 
error vector. We define the error vector for the rth iteration to be Definition 1 


=x X, 


where x” is the rth approximation to the exact solution vector X. When 
the sequence converges to X, the error vector approaches the zero vector. 
We require the error vector e"” to get “smaller”, in some sense yet to be 
defined, as r increases. 


The solution vector X must satisfy the equation 
X = GX + Hb, 

since this is merely a rearrangement of the equation 
AX —b=0, 

just as 
x= Hx? + 3) 

is a rearrangement of the equation 


x? —5x+3=0. (continued on page 20) 


Solution 1 
0 6 —14) 34) —126 
(3) (3) sk (23) 
It does not appear to converge. | 


(continued from page 19) 


From the two equations 
x) = Gx 4 Hb 
X = GX + Hb, 
we can obtain a relationship between successive error vectors. We have 
xt) — x = Gx) — Gx 
= G(x” — X), 


ert) = Geln, 
In the next section we shall use this result and establish a measure for 
convergence. At first glance, it may seem that it is easy enough to judge 
whether e*") is “smaller” than ¢"’; for instance, we can say that the 
individual elements of e*") must all be smaller in absolute magnitude 
than the corresponding elements of e". But, on reflection, this is seen to be 
unsatisfactory. What if nearly all the elements of e*! are smaller than 
the corresponding elements of e’, but a few are just a little bigger? Looking 
at the individual elements will not do: we must find a single number as 
a measure. 
Exercise 2 
Two other rearrangements of our original equations 
xX, +4x, =6 
2x, +x. =5 
are 
(i) x; = 6 — 4x, 
X2=5—2x, 
(ii) x; = 3 + 05x, — 2x, 
2 = 2.5 — x, + 0.5x2 
Write down the corresponding iteration formulas in the matrix form 
xt) = Gx + Hb, 
and calculate the first five error vectors in each case from the equation 
ett) = Gelr, 


starting with e = (3. | 
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Solution 1 


Exercise 2 
(4 minutes) 


28.2.2 “Measuring” a Vector 


To study the convergence of the iterative method we must have some 
measure by which we can test whether the error vector is getting “smaller” 
at successive iterations. We can invent a measure in any way we like, 
and convergence may well depend on the measure we choose. So we 
must give some thought to what sort of measure is reasonable. 


In the first place, we want our measure to be a single number because it is 
easy to compare numbers and to decide whether the sequence of such 
numbers, obtained from successive error vectors, is convergent to the 
number associated with the zero error vector. This means that, in the 
general case, we want to define a function which maps R" to R. 


We begin by considering some unsatisfactory measures, so that we can 
get a clearer idea of the conditions which a suitable “measure” must 
satisfy. 


Leta=| + | and Q=| - ], the zero error vector. 
4, 0 
Let us define the function 
a— a, (ae R"), 


so that a, is a “measure” of a. 


This measure is clearly unsatisfactory, for, although the sequence of a,’s 
obtained from successive error vectors may converge to zero, this fact 
does not tell us what is happening to the remaining elements in the error 
vectors. 


Consider the function 


a— Ya (@eR"). 


Then, 


and we want to see if the sequence of “measures” obtained from successive 
error vectors converges to zero. 

But again this is unsatisfactory, because, for instance, if n is even and the 
error vectors are 


for all r > N, where N is some positive integer, then e’—+ 0, but the 
error in each element of the estimated solution to our system is not 
negligible. 

The first measure is unsatisfactory because it does not take account of 
all the elements in the error vector. The second measure is unsatisfactory 
because, although it does take account of all the elements of the vector, 
we can get a misleading result if some of the elements are positive and 
some are negative. 
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Main Text 


(continued on page 22) 


Solution 28.2.1.2 
0 -4 19) 
i) tt) ) 
(i) x i ‘x +(, ‘le 


The first five error vectors are 


(' (=) (f) Bast 64a 
Bl’ \—2al \spP \—16«)’ \o4ap 
The error vectors are getting bigger by any reasonable “measure”, 


which suggests non-convergence; this agrees with our conclusions 
in Exercise 1. 


os 2 05 0 
i) ot) = “r) 
Bk os} “( ast 


The first five error vectors are 
a) [05a —2B\ [2.25% —2p) | 3.125« — 5.58 
(}) es + ar + pee (eee + 3.125p)" 
7.06254 — 98 
Bas + 7.0625p)° 


This time it is not so easy to see what is going on. For instance, if 
we choose « = /} = 1, we get the following error vectors: 


(' | - at ( | a | = mee 
yy’ \-os}’ \1.25/’ \ 0.375)’ \ 2.5625)" 
Again, we have a sequence of vectors which does not look very 
hopeful. a 


(continued from page 21) 


We can improve on the second measure in two fairly obvious ways: we 
define the functions 


n a2 
m, a— S a (ae R") 
fea 
and 
4 
m;:a'—* >, |aj| (ae R"). 


ist 


The measure defined by m, is the more obvious for two reasons: the 
measure is the same as the standard deviation of the a; from zero; also, 
in two or three dimensions, this measure can be interpreted as the length 
or modulus of the corresponding geometric vector. 


We shall look at each of these measures briefly to illustrate their use, 
restricting ourselves to the geometric interpretation in two or three 
dimensions. 
We look again at the two-dimensional example we discussed in the pre- 
vious section, and, in particular, at the rearrangement which led to the 
equation 

eth) — Gen 


where 
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Solution 28.2.1.2 


If we write 


then 
mye —+ (ef)? + (ef). 


If we represent the error vector by an arrow from the origin of a set of 


Cartesian co-ordinates to the point with co-ordinates (ef, e’), then 


m,(¢") is the length of the arrow. Now, we have 
eye 0 —0.5) [ef 
ed 3 Ves 0 (2) 


et) = -0.5 ef) 


ef*) = -0.25 ef). 


that is, 


cy 
Let us suppose that the initial error vector e = |} ; then we can draw 
PP! B 


an arrow from the origin to represent the error vector. The successive 
error vectors are then represented as follows. 


ea eee 

1 —0.58 0.250 

2 0.125« 0.1258 
42-direction 


8 


a 1-direction 


In this particular example, the error vector has one of two directions 
and alternates between them, so that, after two steps in the iteration, the 
error vector is again pointing in the same direction, but its length has 
been decreased by a factor of 8. By taking a sufficient number of steps 
we can make the length of the error vector as small as we please, ie. we 
can get the pointed end of the arrow as close to the origin as we please. 
This suggests that, with this measure of an error vector, the iteration 


P < « “ 
method will converge to a solution whatever error () there is in our 


initial guess which starts the iteration. 
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We now consider the geometrical interpretation of the second measure, 
defined by: 


ma) = ¥ |a. 
1 


With the notation as before, in two dimensions we have 
me) = |eY| + le, 


which is represented as the sum of the two lengths marked in red on the 
diagram. 


2-direction 


1-direction 


In our example, we see that the sum of the moduli for successive error 
vectors gets smaller and smaller, so that once again we have an intuitive 
concept of the error vector converging to the zero vector. 


Having gained an intuitive impression of what we mean by convergence 
for vectors, we shall now be more precise. 


A measure of the kind we have been discussing is called a norm. A norm 
is a function with certain special properties which maps the elements of 
a vector space V to R. The image of a vector v under this function is 
denoted by |{p|| (read as “the norm of yp”), 


A norm is defined to have the following properties: 
(i) |x|) > 0, for all ve V; 
ii) |\v) = Oifand only if v = 0: 


(ii) jx, + vol) < ley I] + lie2|), for all vy, v2 eV; 
(iv) {Jov, || = lal lz, ||, where « is any real number. 


We now define convergence in a vector space as follows. Let m be a norm 
on the vector space, and let 


x), (2), 


be an infinite sequence of vectors. This sequence is said to have the limit 
X if the sequence of norms of the error vectors: 


Ix — XI, ix — Xi,..- 


has limit 0. So we have defined convergence in a vector space in terms of 
the more familiar notion of convergence in R. It is not our intention to 
study norms further, so you may love them or leave them. An obvious 
next step would be to see which of our results for convergence in R can be 
extended to convergence in a vector space, but we leave this to a later 
course. We have shown that we can extend the idea of convergence in an 
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apparently satisfactory way, and that is sufficient for now. The next 
exercise asks you to prove one simple result for the norm m, introduced 
above. 


In subsequent work we shall assume that the norm is m,, since this proves 
to be more convenient in practice than m,. 

Exercise 1 

Consider the norm m, in R", and show that if the limit of the sequence 


pL De ac rw ya nai 2 


then the limit of the sequence 


x4), x2), xO), 


is X,, where 


(PY oe 


Notice that this result tells us that, if a sequence of vectors converges, 
then the sequences of corresponding elements in the vectors themselves 
converge to the corresponding elements in the limit vector. a 


Exercise 2 


This exercise is intended to give some geometric impression of m, in 
the two-dimensional case. We shall refer to it in the subsequent text. 


ey 
Lete = | be an error vector. 
ez 


(i) Draw the graph of the solution set of 
my(e) = ley] + leal = 2 


(ie. |x| + |y| = 2 in Cartesian co-ordinates). 
(HINT: Consider the four quadrants of the plane separately.) 
What does this imply about the arrow, starting at the origin, represent- 
ing the error vector? 
(ii) What region of the plane represents the solution set of 


ley! + leal < 2? 
(ie. |x] + |v] < 2). a 
Given a sequence of error vectors, we now have the means to test it for 
convergence using the norm mj in R". But this is not ideal, because we 


first have to obtain the sequence before we can test it. We know that the 
sequence is obtained from 


ert) = Ger 
if our original equation 
Ax =h 
was arranged in the form 
xi) = Gx + Hb. 


In other words, the sequence of error vectors is determined by the re- 
arrangement; in particular, it is determined by the matrix G. So we must 
now determine what conditions we can impose on G (ic. on the rearrange- 
ment) so that the resulting sequence of error vectors converges to 0. 
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Exercise t 
(5 minutes) 


Exercise 2 
(4 minutes) 


(continued on page 27) 


Solution 1 


Since the limit of x), x? is X, we know that the sequence of real 


numbers 
Has — XI, lx® — XI... 
converges to zero, 


Therefore, using our definition of the norm m,, we know that the limit 
of the sequence whose kth term is 


Ie — Xgl + 1x) — Xa) + Lal — X, 
is zero. 


We now need an obvious but unproved result, that is, if y and y are 
sequences all of whose terms are positive, and 


lim (v + pv) = 0, 


then both y and p converge to zero. (You might like to revise your 
knowledge of Unit 7, Sequences and Limits I and prove this result.) 


We can split our given sequence into n sequences, one of which is 
xf? — Xi) |x? — Xyh..- 

Since all the terms in the expression 
IP — Xa] + [x — Xa) + + [xf — XY) 


are positive, we can apply the result quoted above. Instead of two 
sequences, we have n sequences, one of which is derived from the first 
term in the bracket above, ie. 


[xf — X41, [xP — X4),... 


and this, therefore, converges to zero, i.e. the sequence 


1 

x0, x), 
converges to X,. a 
Solution 2 


(i) In the first quadrant of the plane, x > 0, y > 0 and the solution set is 
x+y=2, iey=2—x. 


In the second quadrant of the plane, x < 0, y > 0, so that the solution 
set is 


—x+y=2, ie y=2+x, 


and so on, 


e 


| eee ex 
/ 
NY 
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Solution 1 


‘Solution 2 


FM 28.2.2 


The graph of the solution set is, therefore, the square shown in red 
on the diagram. The head of the arrow lies on the square. 

(ii) The region representing the solution set is the interior of the red 
square. a 


(continued from page 25) 


Now we have met this sort of problem before. In Unit 2, Errors and 
Accuracy, we solved equations in R by rearranging them to get a con- 
vergent iterative sequence. We found that there was a criterion associated 
with the rearrangement which determined convergence: it was a condi- 
tion on the scale factor, Let us just look at this again. 


Suppose we want to solve the equation 
L(x) = 9, 

where f is a real function, and we have a rearrangement 
% = F(x) 

of this equation, leading to the iteration formula 
xt) = F(x), 

We defined the scale factor for the function F to be 


estimated error in image of x 
error in x 


error in x"*)) 
error in x" 


We took a guess, x") say, at the solution of x = F(x), and calculated 
the scale factor, which we assume changes little in the neighbourhood of 
the solution. If the magnitude of the scale factor were greater than 1, then 
the error interval containing the guess and the actual solution would 
grow, and if the magnitude of the scale factor were less than 1, then the 
error interval would shrink at each application of F, and the iterative 
procedure would converge to the solution. 


Now let us have a look at what this means in our present context. The Main Text 
scale factor here may be defined to be od 


m,(e’*") _ the norm of the estimated error in Plies 
me) the norm of the estimated error in x"” : 


Note that the scale factor cannot be negative, by our definition of a norm. 
If this scale factor is less than 1 for all r, then 


m,(e"*") < m,(e”), 
ie. 

eet + Jeg t | toes + let) < fell + le +++ + lew. 
This inequality is true for all r, so we can find a number k, 0 < k < 1, 
such that 


% . 
DlegeM<k Y let 


i=1 


<k? Y lef 
faa 


<k Y lef". 
im 
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It follows that 
lim | 
rlarge 


since ¥ |e{?) is a constant. 
= 


Now, since 0 < k < 1, lim k” = 0, so that 


large 


le) < (1m « x (Si). 


rlarge 


lim | ¥ jet] = 0. 
rtarge | 

So once again the scale factor holds the clue. If the scale factor is less 
than 1, then the sequence of norms of the error vectors converges to 
zero, i.e. the sequence of error vectors converges to the zero vector, and 


the sequence of iteration vectors x" converges to the solution X. 
We now have a criterion for convergence, viz. 


my(e"*") 


ys 
me) ~ 
We know that 
et) = Gem, 


so that this condition becomes 


m,(Ge'”) 
mle") 


and it looks as if the clue to convergence is held by G. The information 
contained in G will relate the norms of successive error vectors in some 
way. 


We shall illustrate this in the two-dimensional case. 
If 
myle"*") < km,(e), where 0 < k < 1, 


then the lengths of the sides of the squares are decreasing by a factor k 
at each step, and the arrow representing the error vector, which is trapped 
inside the appropriate square, is approaching the zero vector, ie. the 
sequence of vectors is converging. (See Exercise 2.) 


We shall now examine what the scale factor condition means in terms of 
the elements of G. We shall do the analysis in two dimensions for simplicity, 
but the arguments can easily be generalized. 


Suppose 


é ae) 
821 822 


Successive error vectors are then 


Gale ai 


so 

ef P = ge? + B28? 
and 

eft = gael? + ge. 
Thus 


mye*) = lef? | + eft) 
= [gare + 8129 + learel? + 822€8") 
< lease + ler2e¥| + leave) + Ieare9l, 
using the triangle inequality (see Unit 22, Linear Algebra 1, section 22.1.3). 
Since, for any real numbers «, p. 
la x Bl = lal x |B, 
we have 
leet + lee?) <igu alle + les alle) + igailletl + lezal lev", 
and finally 
leh") + lel) <dgial + Igaildlell + (gral + lezaldle?'. 


Therefore if we can find a number k in the range 0 < k < I such that 
both 


Igual + gail Sk 
and 
gral + [gaal Sk 


the scale factor condition is satisfied, and the sequence of error vectors 
converges to the zero vector. Thus we can test the matrix G by adding 
the moduli of the elements in cach column. If the largest column sum is 
less than 1, the sequence of error vectors converges, For example, if 


0.3 0,5 
ape 
0.2. 0.1 


then we have 
Igul + [gail = 0.5 and |gy2| + Ig22| = 0.6. 


If we take k = 0.6 (or any number between 0.6 and 1), then the scale 
factor condition is satisfied, so the iterative sequence produced by the 
matrix G converges. 


(This number k, representing the maximum sum of the moduli of the 
elements of a column of a matrix, can itself be regarded as a norm ofa 
matrix. In other words, it is a method of “measuring” or attaching a 
number to a matrix.) 
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Exercise 3 Exercise 3 
a A f (3 minutes) 
Test each of the following matrices to see if the sequence of error vectors 


obtained from the equation 
et) = Gen 


is certain to converge to the zero vector. If it is, give a suitable value for 
the constant k. 


) G a 
i = 
) 0.6 0.2 
ii) G ( 0.7 x 
ii = 
. =0:5 Or 
vs 04 03 
(iii) G= 
0.6 0.7, 
0,33 —0.72| 
iv) G= 
e lee 4 * 
Exercise 4 Exercise 4 


. —s . (3 minutes) 
Rearrange the following equations in such a way that the resulting 
iterative method converges. Hence solve the equations by iteration. 


5x, + 3x, =7 
2x, — 4x, = 3. a 


The scale factor test can be applied to a sequence of error vectors in a Discussion 
vector space of any finite dimension. We sum the moduli of the elements a 

of each column of a matrix, and if all the sums are less than or equal to k, 

which is a positive number <1, then we have guaranteed convergence, 

For example, for three dimensions, we have another geometric interpreta- 

tion. Using the norm |e,| + |e3| + |e3|, we confine the arrow representing 

the error vector inside a box in the shape of an octahedron which gets 

smaller at each step if the method converges. 


For more than three dimensions we have no geometric picture, but the 
algebra is just the same. 


Finally, we turn to a point made in the introduction to this section. We 
said that the iterative method is useful when the matrix of coefficients 
is a sparse matrix. This is so because when there are only a few non-zero 
elements in each row, we can readily isolate one of the elements of the 
solution vector and express it in terms of relatively few elements on the 


right-hand side. (There may be some manipulation required to obtain 
such an equation for each element.) For example, the arrangement 


xX, = 02x, + 0.5x5 + 1 
X = 04x, + O.1x3 + 2 
X3 = 0.2x2 + 0.3x4 + 3 
X4 = 0.5x3 + 0.2x5 + 4 
Xs = 04x, + 0.3x, + 5 


has 3 fewer multiplications on the right-hand side of each equation 
than it might have. This is because the matrix of coefficients written in the 
form 


1 —0.2 0 0 -05 
-04 1 -O.1 0 0 

0 -02 i! —0.3 0 

0 0 -05 1 —0.2 
-04 —03 0 0 1 


is relatively sparse. The fact that the number of multiplications required 
is drastically reduced would be even more pronounced if, for example, 
there were only 5 non-zero elements in each row of a 100 x 100 matrix. 
The final judgment on efficiency, compared with a direct method, is 
more difficult however, for we also have to decide how fast the iterative 
process converges, assuming it does converge; that is how many steps 
are required to obtain the required accuracy. 
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Solution 3 Solution 3 


(i) The sequence converges. Any k such that 0.95 < k < 1 is suitable. 
(ii) The sequence does not necessarily converge. 
(iii) The sequence does not necessarily converge, 

(iv) The sequence converges. Any k such that 0.99 < k < 1 is suitable. 


Note that we have only shown that 
condition satisfied > convergence 
and not that 
condition not satisfied > divergence; 


we may still have convergence in the latter case, although it may be more 
difficult to prove. Thus you might like to look at the sequence generated 


1 
by the G’s in cases (ii) and (iii) starting with ('. a 


Solution 4 Solution 4 
One such rearrangement is 

xX; = 14 —0.6x, 
~0.75 + 0.5x,. 


With this rearrangement, 


"8 Sey 
G= 
0.5 0 
and the iteration formula takes the form 
wel a es a) 
z 05° 0 ~0.75 


Starting with any x", we shall eventually get as close as we please to the 
iB iy ly Pp 


37 
tial solution) (2° ( es . 
actual solution = Fy 
oqreiaeecen ey ~0.04 
26 


FM 28,2.2 


Exercise 5 Exercise 5 


The following is a simple library program, called $ ITER which you can 
use to get solutions of simultaneous equations by iteration. You ought 
to be able to follow how it works by inspection. 


- PROGRAM COMMENT 
5 PRINT “X = AX + B, A AN N*N MATRIX” Sets the scene. 
10 PRINT “N =" 
15 INPUT N Asks for matrix size 


16 IF N < = 10 THEN 20 which must be less 


17 PRINT “N TOO LARGE GIVE A NUMBER FROM | TO 10” oa ds 

18 GO TO 10 

20 PRINT “A =" 

25 FOR l=! TON Inputs matrix A 
ON peace cepa each 
35 INPUT Ail, J) input request. 
40 NEXT J 

45 NEXT 1 

50 PRINT "B=" 

55 FOR l=! TON Inputs B. 

60 INPUT Bil) 

65 NEXT I 

70 PRINT “INITIAL X =" Requests initial 
11 FOR 1=1TON estimate X. 

72 INPUT X(I) 

73 NEXT I 


75 PRINT “WHAT NEXT? REPLY WITH —1 IF FURTHER ITERATIONS 
ARE REQUIRED” 


76 PRINT “+1 IF A NEW MATRIX IS TO BE INSERTED AND 0 TO 
TERMINATE” 


80 INPUT X 

85 IF X = —1 THEN 110 

90 IF X= +1 THEN 5 

95 IF X =0 THEN 300 

100 PRINT “REPLY MUST BE ~—1, +1 OR 0” 

105 GO TO 80 

110 PRINT “HOW MANY ITERATIONS?” 

115 INPUT M Makes sure that no 
120 IF M <2! THEN 135 more than 20 iterations 
i ie are requested, 

125 PRINT “GIVE A NUMBER FROM | TO 20” 

130 GO TO 115 


PROGRAM 
135 FOR K=1TOM 
140 FOR I= 1 TO N 
141 Y(I) = B(I) 
142 NEXT I 
145 FOR I=1 TON 
150 FOR J=1 TON 
155 IF A(I, J)=0 THEN 165 
160 LET Y(I) = Y(I) + A(l, J) * XQ) 
165 NEXT J 
170 NEXT I 
175 FOR I=1 TON 
176 X(I) = Y(I) 
177 NEXT I 
180 NEXT K 
185 PRINT “X=” 
190 FOR I=1 TON 
192 PRINT X(I), 
194 NEXT I 
196 GO TO 75 
300 END 


(i) As it stands, this program cannot tell you whether the matrix A 
which you input makes the iteration likely to converge. Write a set 
of instructions which will modify the program so that it will tell you 

" 


the maximum value of ¥ |a,J for j = 
fox 


unless this value is less than 1. 


(ii) Use the program to solve the following 5 x 5 system to two decimal 


places. 
x, — 0.2x, 
—0.4x, + L1x, — 0.1x5 


—0.2x,+ x3 —0.3x4 


— 0.5x5 


y-+.,m, and refuse to iterate 


1 
=2 


=3 


— 0.5x3 + 1.2x, —0.2x5 = 4 


—0.4x, — 0.3x 
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ie 


xs=5. 


COMMENT 


Performs the iterations. 


Prints results. 
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28.2.3 Summary 28.2.3 


To solve the system of equations represented in matrix form by the ‘Summary 
equation a 


Ax = b, 
we rearrange it in the form 
x = Gx + Hb. 


This leads to a possible iterative sequence defined by 
x= Gx0+ Hb (r= 1,2...) 


which will converge if the sum of the moduli of the elements in each 
column of G is less than 1. We proved this result by examining the be- 
haviour of the error vectors 


= x -X (r= 1,2...) 
which satisfy 
et) = Gen (r= 


where X is the exact solution vector. 
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Solution 28.2.2.5 Solution 28.2,2.5 
(i) A suitable set of instructions would be: 
46 GOTO 200 
200 LET P=0 
205 FOR J=1 TON 
210 LET Z=0 
215 FOR I=1 TON 
220 LET Z = Z + ABS(Ai(I, J)) 
225 NEXT I 
230 IF Z < P THEN 240 
235 LET P=Z 
240 NEXT J 
245 PRINT “MATRIX NORM ="; P 
250 IF P <1 THEN 50 
255 PRINT “MATRIX NORM TOO BIG” 
260 te. 7 
(ii) A suitable rearrangement of this is 
0 0.2 tt) 0 0.5 1 
04 —-O1 01 0 0 
s=| 0 02 0 0.3 0 jx+ 
0 0 05 —02, 0. 
04 0.3 0 0 0 
The solution to two decimal places is 
6.44 
4.72 
X= | 6.16 
7.40 
8,99 a 


2 


Uk w 
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28.3. ILL-CONDITIONED SYSTEMS OF 
EQUATIONS 


In this section we look at a particularly disastrous way in which errors 
can sometimes accumulate when we solve simultaneous equations based 
on inexact data or when we introduce rounding errors during solution. 
In fact, the error accumulation may even render the results of solving 
the simultaneous equations virtually useless. When small changes in the 
data have a large effect on the solution of a system of equations, then we 
say that the system is ill-conditioned. This term has no precise definition 
but is used in the same relative sense that adjectives like “small” are 
used in English. “Small changes in the data” producing “large changes 
in the result” is simply one way of describing the phenomenon of ill- 
conditioning; the adjectives used indicate the imprecision of the concept. 
A numerical illustration of ill-conditioning is given in the following 
example. 


Example | 


In a reconstruction of a crime, bullet holes centred at A and B were found 
in a double-glazed window at distances apart indicated on the diagram, 
the measurements being taken to the centres of the holes. All the measure- 
ments were taken to the nearest centimetre. Other evidence showed that 
the gun was at the level CD when fired. How accurately can we determine 
the position from which the gun was fired? 


Before following through the solution to this example, you may find it 
helpful to assume that the measurement of 10cm is exact and draw 
accurately, on a piece of graph paper, the two most extreme possible 
trajectories of the bullet, assuming them to be straight lines. 


Solution of Example 1 


If we fit x, and x, co-ordinate axes on to the diagram, we are effectively 
finding the intersection of the straight line AB with the x,-axis CD. 


—— sheets of glass 
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Example | 


The equation of the straight line AB is 

xX, = mx, + ¢, 
where (in theory) we can find m and c from the data. The distance xg 
from C at which the bullet was fired is the value of x, when x, = 0, ie. 


X= -— 
m 


If we assume the data to be exact, then we know that AB passes through 
the points (0, 20) and (10, 18), so that 


2=c 
18 = 10m + ¢, 
20 
whence m = —0,2. Hence xg = mm = 100cm = 1m. 


But the data are not exact: c could be 19.5 and then m could be as small 
as the value given by the equation 


18.5 = 10.5m + 19.5, 


= 
10.5" 
so that xg could be as large as 19.5 x 10,5cm = 204.75 cm. That is, the 


approximate value for xg (1m) could be over | m in error. In fact, all 
we can say is that 


Xq € (64.92, 204.75] cm. 


In other words, the distance the gun was fired from the window could 
be anything between just over 0.5 m and just over 2 m. This is not a very 
accurate result when you consider the apparent accuracy of the original 
measurements, a 


m= 


In geometric terms, in the last example we were trying to find the inter- 
section of a pair of nearly parallel straight lines (the x,-axis was one of 
these). If there is a small error in the original data, it can result in a con- 
siderable error in the solution. One possible way of envisaging what is 
happening is to imagine two torches, or searchlights, with their beams 
crossed, 


The exact solution, if the data are exact, is represented by a point P. 
With inaccuracies in the data, represented by the diverging beams from 
the torches, the straight lines could be anywhere within the beams. The 
point representing the true solution could be anywhere within the “area 
of intersection” which is shaded in the diagram. 
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(See RB3) 
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Intuitively, the more nearly parallel to each other the axes of the torch 
beams become, the larger the shaded area, and the further away the 
true solution can be from the approximate solution. 


Now let us look at a more general case in two dimensions in the following 
exercise. 


Exercise | Exercise 1 
Solve the following simultaneous equations for x, and x,: as 
Xyt+ x2,=1 


2, 


Xy + (1 + @)x2 


where ¢ is some real number, by the Gauss elimination method. Write 
down the solutions corresponding to 


(i) ¢ = 0.01, 0.02 
(ii) ¢ = 2.01, 2.04 


respectively. Compare the percentage changes in the coefficient of x in 
the second equation for cases (i) and (ii) with the percentage changes in 
the corresponding solution for x. cy 


This last exercise was rather hypothetical. However, the rounding errors Discussion 
which occur both in the initial data from a practical problem and in the 
solution process itself are not hypothetical. These small rounding errors 
can readily lead, in ill-conditioned equations, to highly inaccurate results. 
A simple way to test for such ill-conditioning is to do what we did in the 
last exercise, that is, to make a small change in some coefficients to see 
what effects there are on the solution, but this is difficult to do with larger 
sets of equations, particularly when a solution set looks acceptable in 
the physical sense. There are more sophisticated methods which give a 
practical indication when ill-conditioning is present, but most of these 
will involve us in too many technicalities. We shall describe just one 
such method, 


Theoretically, ill-conditioning can be described in the following way. 
Given a system of simultaneous equations in matrix form 


Ax = b, 


the system is ill-conditioned if the n columns of the matrix are almost 
linearly dependent (note the vagueness again), or, in other words, if the 
matrix is nearly singular. This means that all the elements of A are close 
to the corresponding elements of some singular matrix. In the last exercise, 
for example, A would be 


baad 
1 t+e) 


Clearly if e were small (we found this made the equations ill-conditioned), 
a small change, that is a change of —e in a,,, would turn the matrix 
into the singular matrix 


in which the two columns are obviously linearly dependent. 


Exercise 2 Exercise 2 
Determine the inverse matrix A~' of eae 
1 1 
A= & 
te ae 
What are the elements of A~! if e = 0.017 | (continued on page 40) 
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Solution 1 
xy+ x, =1 


x, +(1+8)x, =2 


Carrying out the Gauss elimination method gives 
xX) +x, =1 


éx, = 1, 


which gives 


(i) (ii) 


& 0.01 0.02 2.01 2.04 


Solution to 2 
decimal places |(—99, 100) (—49, 50) | (0.50, 0.50) | (0.51, 0.49) 


Percentage change of coefficient in x, is 1% to one place of decimals in 
each case. 


In (i) the solution x, changes by 50%; in (ii) by 2%. This indicates that 
“when « is small the equations are ill-conditioned” means in this case 


“small compared with unity”. a 
Solution 2 
eae 
6 & 
Abs 
1 1 
é é/ 
| 101 —100) 
—100 100; a 


(continued from page 39) 


This last exercise shows one feature of the inverse of the matrix. of 
coefficients of an ill-conditioned system of equations ; that is, its elements 
are relatively large when compared with the elements of the original 
matrix. This leads to one final point about systems of equations which 
may be ill-conditioned. A recommended and useful means of checking 
the accuracy of an estimated solution x, of 


Ax=b 
is to premultiply x, by A and obtain a vector b,. That is, 
Ax, = by. 


If the elements of b, differ very little from the elements of b, then we would 
normally assume that the solution is fairly accurate. (Remember that 
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Solution 1 


Solution 2 
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there will almost always be some inaccuracies from the rounding errors 
in a real computation of any length.) If the system of equations is ill- 
conditioned, however, the solutions can still be grossly in error; for if 
we subtract the two equations above we get 


A(x, — x) = by — b, 
or, using the error vector notation, 
Ae=E 
where 
e=x,-x and E=b,—b 
Then we shall have 
e=A'E. 


If A~! contains large elements, it is intuitively obvious that e can be 
large even though E is small. So, whenever we suspect ill-conditioning, 
it is worth having a look at the size of the elements of A~* relative to 
the elements of 4, 


Exercise 3 Exercise 3 
(4 minutes) 
If 
i 
1.01)" 


find an E with norm (sum of the moduli of the elements) <0,01 which 
leads to an e with norm >1. t 


Finally, there is the problem of what to do about ill-conditioning when Discussion 
it occurs and we do recognize it. It may well be that reformulation of the 
problem in terms of different variables, as may be possible with sets of 
ill-conditioned simultaneous equations arising from problems involving 
engineering structures, will lead to a well-conditioned set of equations. 
Otherwise we can use double precision arithmetic in the calculation, 
that is, carry twice as many digits throughout the work, in an attempt to 
get better results, but if the equations are badly ill-conditioned this will 
still be of no avail. Then the advice is —forget the problem, or quote 
the result to whatever accuracy is obtained, however bad. 


Summary 


When the matrix of coefficients is nearly singular, that is, a small change Summary 
in some elements of the matrix will produce a singular matrix, the matrix 
equation 

Ax=b 


is said to be ill-conditioned. This usually means that the solution is highly 
unreliable, particularly if the original data, from which the matrix equa- 
tion arose, were inexact. 
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Solution 3 Solution 3 
One example is 

—0.004' 

es | 0.004, 
Then m,(E) = 0.008 < 0.01 and 
101 +—100\ /{ —0.004' —0.804) 

~100 iol | . | 08 
with m,(e) = 1.604. a 


28.4 CONCLUSION 


In this text we have examined the practical problem of how systems of 
simultaneous linear equations, particularly large systems, can be solved. 


In section 28.3 we drew attention to the phenomenon of ill-conditioning 
which can arise in such systems of equations, and which may make it 
very difficult, or even impossible, to find any reliable or useful solution 
to the equations. 


We have demonstrated the basic direct and indirect methods of solution. 
Many refinements grow out of these and can be found, for example, in 
Fox, Numerical Linear Algebra (see Bibliography). All we have done here 
is to examine the basic methods and to look at their efficiency and 
accuracy. 


Postscript 


“That’s carrying things a step too far, I draw the line at that.” 
Harry B. Smith, 
We Draw the Line at That 
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Notation Page 
The symbols are presented in the order in which they appear in the text. 
A A non-singular matrix of coefficients. 1 
x A vector. 1 
Cc The set of all complex numbers. 2 
A! The inverse of A. 2 
x; An element of the vector x. 3 
a, Anelement of the matrix of coefficients A. 3 

8 


x Therth approximation to the solution vector in an iterative method. 1 

xf? An element of x". 18 

X The exact solution vector. 19 

e The error vector: 19 
(x — X). 

ef Anelement of e"”. 23 

\a\) A-normrof the vector a defined by 24 


Wall = lay] + laal + +> + lal, 


where qis ann x 1 vector with elements a,,i = 1,..-,” 


vii 


The program has been designed so that you can follow the individual steps fairly easily; it is, 
therefore, neither the most elegant nor efficient program possible. 


